ABSTRACT

Arguments about trade-offs existing between teaching 
and research affect much of the communication discipline as scholars 
engage in arguments about the future directions of departments. A 
study summarized more than 40 quantitative studies and found a small 
heterogeneous positive correlation between teaching effectiveness and 
research productivity. Positive teaching evaluations correlate with 
increased research productivity. While the finding should not be 
interpreted as direct evidence of any causality between the 
variables, the evidence points to an association that deserves 
consideration. While the correlation is small, the association 
remains positive, suggesting that research productivity does not 
necessarily contradict efforts at quality teaching. The finding 
warrants a more thorough understanding of those features associated 
with both increased research productivity and positive teaching 
evaluations. Building a strong and effective communication department 
requires that the philosophical underpinnings take into account the 
requirements and potential trade-offs between research productivity 
and teaching effectiveness. (Contains 4 notes, 120 references, and 3
tables of data.) (RS)

ABSTRACT

RESEARCH PRODUCTIVITY AND POSITIVE TEACHING EVALUATIONS:
EXAMINING THE RELATIONSHIP USING META-ANALYSIS 



A long-term controversy surrounds whether college faculty face a
trade-off between producing research and offering quality instruction.
The debate assumes that a college professor cannot combine excellence in
teaching and research. This summary of more than 40 quantitative studies
found a small heterogeneous positive correlation between teaching
effectiveness and research productivity. Positive teaching evaluations
correlate with increased research productivity. While the finding should
not be interpreted as direct evidence of any causality between the
variables, the evidence points to an association that deserves
consideration. While the correlation is small, the association remains
positive, suggesting that research productivity does not necessarily
contradict efforts at quality teaching. The finding warrants a more
thorough understanding of those features associated with both increased 
research productivity and positive teaching evaluations. 

Almost all universities evaluate faculty based on three standard
areas: (a) research, (b) teaching, and (c) service. The trend in recent
years seems to favor increasing the prominence and importance of research
as a standard for tenure and merit for college professors (the publish or
perish syndrome). A long-standing controversy involves whether trade-offs
exist between research and teaching for college faculty (Faia, 1976;
Grant, 1971; Hammond, Meyer, & Miller, 1969; Harry & Goldner, 1972; Jauch,
1976; Kurland, 1961; Iavis, 1992; Martin & Berry, 1961; Kodden, 1993;
Schachter, 1991; Smith, 1961; Woodburne, 1952). As the institutional
emphasis on research appears to increase, the controversy surrounding
whether such increased expectations for faculty research productivity
diminish the quality of teaching becomes more important.

The reasons for increasing the emphasis on research are probably
several and vary with the internal pressures of the
institution. The increased desire for outside funding places an emphasis
on those attributes believed related to the potential for generating funds
(research capability) (Burgoon, 1988). One communication scholar
(Burgoon, 1989) argues for a separation of those departments interested in
research from those departments devoted to instruction (the place of
service within this debate seems lost). If communication departments
emphasize research and external funding, this may be deleterious to the
quality of instruction provided by those departments.

The belief in or search for an objective merit system for promotion,
tenure, and salary tends to emphasize those things objectively
quantifiable (number of publications) over those items perceived as more
difficult to objectify (quality of instruction). The desire by
administrations and faculty for higher "status" leads to an emphasis on
research in a belief that more research productivity improves status
within the academic community (for example, several scholars in
communication have commented on the need for scholars in the discipline to
produce "research books"). Several efforts to quantify the contribution
of communication scholars to the discipline exist that note where
prolific scholars obtained their degrees and the institutions currently
employing these persons (Barker, Hall, Roach, & Underberg, 1979; 1980;
1981; Barker, Ray, Watson, & Hall, 1988; Hickson, 1990; Hickson, Stacks, &
Amsbary, 1989; 1992; 1993; Stacks & Hickson, 1983; Watson, Barker, Ray, &
Hall, 1988). This type of scoreboard system has not been without
criticism (Brown, Blair, & Baxter, 1994; Erickson, Fleuriet, & Hosman,
1993), but the ability to provide a metric of comparison for research
output leads to a sense of direct comparison of the value of various
scholars.

The ability to create "status" for the quality of instruction seems
difficult for the purposes of direct institutional comparisons.
Identifying the quality of instruction becomes viewed by individual
faculty as a political football within departments. Instructors become
concerned that factors affecting the favorableness of student evaluations
include: (a) the grade point average of the class, (b) the level of the
class, (c) the size of the class, and (d) the content of the class.
Without considering a host of potential moderating factors, comparing
evaluations directly seems difficult. Some faculty the author has
known refer to student evaluations of teaching as the "Nielson ratings" of
academe. This perspective, when combined with the lack of public
availability of records, makes comparing departments or individual faculty
from different institutions difficult, if not impossible. The purpose of
this review is not to offer a direct critique or support for the validity
of any methods of instructional evaluation.1 This review only intends
to use existing data as a basis for comparison.

The arguments about trade-offs existing between teaching and research
affect much of the discipline of communication as scholars engage in
arguments about the future directions of departments. One argument
concludes that increased status for communication departments comes from
the ability to generate money for research. Some scholars argue that the
inability of communication researchers to generate valuable theory and
affect the intellectual climate of other disciplines indicates a weakness
of the field. Other academicians suggest the ability of communication
departments to attract quality students and provide them the skills for
employment and empowerment creates a strong justification for communication
departments. These arguments either directly or indirectly consider the
connection between quality of instruction and research productivity. The
concern taken with students and/or research, especially when evaluating
the efforts of our faculty, requires consideration of the relationship
between research productivity and teaching effectiveness. Building a
strong and effective communication department requires that the
philosophical underpinnings consider the requirements and potential
trade-offs between research productivity and teaching effectiveness. The
next two sections explore arguments about the nature of the connection.

ARGUMENTS THAT A RESEARCH EMPHASIS 
UNDERMINES POSITIVE TEACHING EVALUATIONS
One issue advanced by those persons critical of requiring a great deal
of research by faculty is that the trend away from the classroom and into
research changes the role of college faculty. The argument runs that time
spent on research and writing is subtracted from the time spent with
students in classes, laboratory sessions, and office hours. Much of the
research arguably could never benefit the students (or society) and
therefore the emphasis on research becomes misplaced. The research fails
to improve classroom performance or the mind of the instructor. The
research serves to increase the economic value and/or professional ego of
the faculty member. Sykes, in his book Profscam (1988), argues
particularly against communication scholarship (see pp. 111-113 for his
analysis of articles appearing in Human Communication Research and
Southern Speech Communication Journal). His argument is echoed by
Bloom's (1987) argument about the tendency of research to benefit faculty
members and no one else. This argument against mindless publication
receives support from persons in communication (Erickson, Fleuriet, &
Hosman, 1993) when they point out as a myth that "research is more
rewarding than teaching" (p. 335). When the pressures of producing
research become an end, without true intellectual advancement, the
expenditure of time in the endeavor may reduce time spent on other duties.

While the faculty member conducting research enriches his or her status in
the university (among certain circles), the result impoverishes the
student in the classroom. The time devoted to the laboratory, word
processor, conventions, editing, and/or library is subtracted from the
time spent on lecture notes, advising students, and improving methods of
evaluation (tests, papers, quizzes). The publication becomes an end,
rather than the idea the published article should contain. Scholars talk
about the LPU (Least Publishable Unit), the smallest amount of information
that justifies a separate article (another listing on the annual report of
activities and a longer vita). The image of "publication monsters"
exists, generating paper trails into the nothingness of larger egos,
benefiting no undergraduate (sometimes not even graduate) students, only
the faculty member.

This argument runs that there exists a zero sum resource called time 
with trade-offs existing between time spent on teaching and research. 
Most research information cannot provide insight to help students in the 
classroom or improve teaching. The professor creates a set of lecture 
notes, chooses a textbook, creates a standardized midterm and final, and 
simple paper assignments. The instructor creates a course that receives 
little updating, new thought, or originality. The textbook companies
provide instructor manuals with exercises, overheads, videocassettes,
tests, quizzes, and discussion questions: all the material necessary for
instruction. The teacher need not think, but only act to convey material
already supplied and organized for transmission and consumption. The 
teaching languishes (by remaining stagnant) while the research publication 
process continues. 

Another issue involves the separation that a research emphasis creates 
between the student and the instructor. Office hours, advising, and 
student activities require time from the instructor. A professor working 
on student-oriented activities (director of undergraduate studies, 
forensics director, mock trial coordinator, adviser to a student-run paper
or radio station, or other professional student organizations like Women
in Communication Incorporated) takes time away from research. Research
professors whose reputation, salary, and job security are not dependent on
these activities could view working with students as a waste of time.
Students become seen as lab rats for experiments, research assistants, or
some other type of research aid. The professor views normal teaching as
onerous and nonclassroom contact as undesirable. The very framework of
the institutional pressures contributes to a deteriorating relationship 
between faculty and students. 

The previously described process reduces scores on teaching 
evaluations. As faculty spend more time away from the classroom and 
students, the quality of instruction languishes and erodes. Students
become alienated from the faculty as the faculty become inaccessible and
unavailable to students. The key to this line of argument is that faculty
members possess little incentive to provide quality instruction and that
without such an incentive the rewards of research become alluring. Since
there exists no limit to the amount of research the professor can produce,
and the more the better, the efforts of the scholar become heavily tilted
toward research (the creation of a giant paper chase).

ARGUMENTS THAT A RESEARCH EMPHASIS 
IMPROVES TEACHING EVALUATIONS
The alternative line of argument suggests a connection between 
research productivity and teaching quality. Teaching many topics within 
the university, especially the field of communication, requires 
sophisticated knowledge of a constantly changing subject matter. A top 
researcher, arguably, provides better, up-to-date, and accurate 
information than someone not involved in research. This occurs because 
the researcher remains active and involved on the cutting edge of theory 
and knowledge while the nonresearcher does not. This argument assumes a
relationship between teaching and research permitting the researcher to
bring the benefits of the research into the classroom. Research shows not
only knowledge but also a dedication to the content and material not
possible for a person who only teaches.

The distinction exists between what might be considered the elementary
and secondary versus college models for instruction. The argument is that
college instructors must offer more than simply quality instruction: the
actual content of the information changes, and the college instructor
should contribute to that process of change. No one expects a high school
or elementary teacher to generate knowledge claims; the emphasis is placed
on quality of instruction.

Consider the nature of teaching some content area at the college
level. An instructor selects a textbook and writes lecture notes, tests,
paper assignments, and relevant supporting documents for students. After
three years, the instructor has a course at the point they feel is
complete, rigorous, and intellectually stimulating. But theories and
knowledge change as more becomes known, considered, and reconsidered.
After a few years, changing particular features of the course becomes
necessary. Consider a recent examination of textbooks using meta-analyses
(Allen & Preiss, 1990) to evaluate their accuracy. That article in part
required communication textbook authors to examine research more closely,
and a reexamination of two later editions of those textbooks (Osborn &
Osborn, 1994; Ross, 1992) provides examples of changes in the treatment of
particular research literature, improving the accuracy of textbooks.

Specialization involved in research permits greater depth of knowledge 
for students to capitalize on. The ability of instructors to act as 
resource persons depends on the ability to grasp current material. The 
faculty member involved in research benefits the student and the community
by providing such material. The enthusiasm necessary for research should
spill over into the classroom as the researcher provides a commitment and
a stimulation not possible if they were not engaged in such research.

EXAMINING THIS ISSUE EMPIRICALLY
A host of quantitative examinations of this issue permits a synthesis
using meta-analysis. That a large number of investigations exists on
this issue should not be surprising, since the data to examine the
relationship are routinely collected. Teaching evaluations (student, peer,
and administrative) occur with regularity, and much effort is involved in
gathering and accumulating such information. The normal process of merit,
tenure, and promotion requires some effort at the documentation of
instructional quality. Research productivity generally receives
evaluation within the faculty merit system, often yearly. The system
generates essentially a regular system of records permitting a researcher
to compare the various methods of rating research productivity. The basis
typically takes some form of quantitative measure of the amount of
research (either in grant dollars or number of publications) produced by a
scholar.

A previous meta-analysis does exist on this topic (Feldman, 1987).
However, there exist many justifications for both replicating and
extending the earlier analysis. First, no literature search method
details exist in the manuscript, making it impossible to know to what
extent the author searched the relevant literature and under what
conditions. There is no way of knowing the procedures of the literature
search, what definitions were used for the process, or the success of the
literature search. A literature search should be explicit to permit
evaluation of the methods, completeness, and adequacy (Preiss & Allen,
1994).

Second, the report gives no indication that a homogeneity test for the
average correlation produced was conducted. This lack of testing constitutes a
serious weakness because the ability to interpret any results of a
meta-analysis depends on a negative outcome for the test of moderator
variables. Feldman (1987) used an early form of the technique developed by
Glass, McGaw, and Smith (1981) that did not provide a test for moderator
variables. The interpretation of the mean effect size in any
meta-analysis depends on the homogeneity of the mean correlation (Dillard
& Hale, 1993; Hunter & Schmidt, 1991). The interpretation of an average
correlation based on a heterogeneous set of correlations would be like
interpreting a main effect in an ANOVA in the presence of an interaction
(Hunter, Hamilton, & Allen, 1989). The interpretation of a main effect
under those conditions remains problematic (Winer, 1971). The failure to
perform a test indicates that no information is known about the possible
existence of the moderator variables and any conclusion remains limited.

Third, Feldman's (1987) treatment of data reports with incomplete
information was to exclude the data set. If a manuscript reported a
nonsignificant finding, Feldman did not report any effect for the data.
The procedure creates a potential upward bias in the average correlation
because it systematically excludes data with small effects that tend to be
nonsignificant; the resulting average correlation represents an
overestimate of the true effect. Other procedures remain available for
recovering and including such data within a meta-analysis
(Allen, Hunter, & Donohue, 1989; Boster & Mongeau, 1984).
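
The direction of this bias can be illustrated with a small sketch. The four studies below are hypothetical values invented for illustration, not data from this review, and the cutoff of |t| > 1.96 is only a large-sample approximation to a two-tailed .05 test.

```python
import math

# Hypothetical (r, N) pairs, invented for illustration only.
STUDIES = [(0.30, 100), (0.25, 120), (0.05, 100), (0.02, 80)]

def is_significant(r, n, critical_t=1.96):
    """Approximate two-tailed .05 test of r via t = r * sqrt((n - 2) / (1 - r^2))."""
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return abs(t) > critical_t

def weighted_mean(studies):
    """Sample-size weighted average correlation."""
    total_n = sum(n for _, n in studies)
    return sum(r * n for r, n in studies) / total_n

# Strategy 1: keep every study with its observed correlation.
all_studies = weighted_mean(STUDIES)

# Strategy 2: drop the nonsignificant studies entirely.
dropped = weighted_mean([(r, n) for r, n in STUDIES if is_significant(r, n)])

# Strategy 3: keep the nonsignificant studies but code them as r = 0.
zero_coded = weighted_mean(
    [(r if is_significant(r, n) else 0.0, n) for r, n in STUDIES]
)

print(f"all data: {all_studies:.4f}")
print(f"nonsignificant dropped: {dropped:.4f}")
print(f"nonsignificant coded as zero: {zero_coded:.4f}")
```

With these invented values, dropping the nonsignificant studies raises the average well above the value obtained from all the data, while coding them as zero gives a slightly conservative estimate.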

Finally, all research requires replication. Even a meta-analysis
requires replication before accepting the results as definitive or
authoritative (Allen & Preiss, 1993). If any error exists by the person
conducting the analysis, the replication should reveal the error and
permit an assessment of it. Such replications occur frequently using
meta-analysis on a variety of topics: (a) persuasiveness of fear appeals
(Boster & Mongeau, 1984; Mongeau, 1994; Sutton, 1982), (b) the consistency
of attitude-behavior relationships (Kim & Hunter, 1993a; 1993b; Sheppard,
Hartwick, & Warshaw, 1988), (c) the impact of foot-in-the-door or
door-in-the-face appeals (Beamon, Cole, Preston, Klentz, & Steblay, 1988;
Dillard, Hunter, & Burgoon, 1984; Fern, Monroe, & Avila, 1986), and (d) the
persuasiveness of message sidedness (Allen, 1989; 1993; O'Keefe, 1993).
The replications offer additional perspectives and reaffirmation of or
challenge to the findings. Scientific advancement depends on findings
that demonstrate replication, even for meta-analysis.

METHOD 
Literature Search 
The literature search used both a manual and a CD-ROM
(SilverPlatter) search of ERIC, Psychlit, and CommIndex. A manual search
was conducted on the Educational Research Index. The key words used were
"faculty evaluation" and "faculty promotion" as well as "productivity."
The reference sections of all manuscripts found by this search received
examination for possible additional sources of information. No
manuscripts existed that were authored by communication scholars or
focused particularly on communication departments. However, more than 75%
of the data sets used faculty across departments at institutions,
suggesting that while communication departments were not the primary focus
of the investigations, communication faculty were undoubtedly included
within this report. For inclusion in this meta-analysis a manuscript had
to contain the following set of information:

(a) a measure of teaching effectiveness;

(b) a measure of research productivity; and

(c) statistical information permitting the calculation of an effect
size.

Manuscripts that presented redundant data sets were not
included in this analysis as separate data points.

Coding of Studies 
Studies were coded for relevant moderator features: (a) year of 
investigation, (b) method of teaching evaluation, and (c) method of 
measuring productivity for research. 
Year 

One feature of interest in this analysis is whether the year of data 
collection would moderate the analysis. The argument runs that the 
changing emphasis on research within the academic community may change the 
underlying relationship between research and teaching. This moderator 
analysis would test this assumption. The actual date of data collection 
and not the date of publication remains the important feature. Therefore 
the earliest public presentation date became the date used if the actual 
date of data collection was not provided. One study (Linsky & Straus,
1975) collected data over a 20-year period and was excluded from this
analysis because the band of years was too wide to permit a single date to be
entered for this estimate. All the coding in this section agreed with
information provided by Feldman (1987) (for those manuscripts he reports).
Method of Measuring Evaluation of Teaching 

The method of teaching evaluation should be considered because of the 
potential differences in assessing instructional effectiveness. Some
studies use: (a) student evaluations, (b) peer evaluation, (c) a
nomination or receipt of an award for teaching, and (d) measurement of 
teaching related activities. The coding of this information agreed with 
the representations by Feldman (1987). 

Method of Measuring Research Productivity

How productive a scholar's research output is
can be measured in a variety of ways. Some methods use a count
(weighted or unweighted) of the sheer number of publications; other
methods rate the productivity of the person on a scale (1 to 5) utilizing
a faculty panel. Some measures consider the quality of publications as an
index of productivity (using the Social Sciences Citation Index as a
measure of quality by counting the number of citations of a scholar's
work). One measure of productivity involved the number of grants a person
had received as a faculty member. A final index takes the publications
and weighs the value of the research in some manner (for example, a book
is 25 points, a single-authored article in a national journal is 10 points,
a co-authored national article is 7 points, etc.). The coding of
research measurement used the following scheme of eight values: (a) number of
publications, (b) grants awarded, (c) number of citations, (d) peer or
chair rating of research, (e) time spent on research, (f) awards for
research, (g) combination of grants and publications, and (h) the research
creativity of the scholar as rated by other faculty. The coding in this
section agreed completely with that information provided by Feldman
(1987).

Statistical Analysis 
Statistical analysis requires three steps. First, the statistical
information within the primary investigation becomes converted to a common
metric for comparison. The formulas for such conversions are well
established and present few unique problems (Hunter & Schmidt, 1991;
Rosenthal, 1984).3 The metric chosen for this study was the correlation
coefficient. The correlation represents a metric easy to transform and
interpret. Statistical information within each study was corrected for
artifacts of measurement such as attenuated measurement or dichotomization
of independent or dependent variables (Hunter & Schmidt, 1991). A
complete listing of the studies and the correlations appears in Table 1.
Table 2 contains the coded features for each individual study and
individual estimates for features where necessary.
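
A minimal sketch of this first step follows. The formulas are the common textbook conversions (t to r, d to r with equal group sizes, and the correction for attenuation). Which conversions were applied to which study is not stated in this section, so the function names and example values are illustrative assumptions, and the correction for dichotomization is omitted.

```python
import math

def r_from_t(t, df):
    """Convert a t statistic with df degrees of freedom to a correlation."""
    magnitude = math.sqrt(t * t / (t * t + df))
    return math.copysign(magnitude, t)  # keep the direction of the effect

def r_from_d(d):
    """Convert a standardized mean difference to r, assuming equal group sizes."""
    return d / math.sqrt(d * d + 4.0)

def correct_for_attenuation(r, reliability_x, reliability_y):
    """Estimate the correlation free of measurement error in both variables."""
    return r / math.sqrt(reliability_x * reliability_y)

# Hypothetical inputs, for illustration only.
print(round(r_from_t(2.0, 96), 3))
print(round(r_from_d(0.5), 3))
print(round(correct_for_attenuation(0.20, 0.80, 0.80), 3))
```

Because unreliable measures can only shrink an observed correlation, the corrected value is never smaller in magnitude than the observed one.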

Second, the separate estimates from each report must be averaged. The
averaging process uses a weighted average that considers the sample size
of the study. The procedures for weighted averaging are standard and used
by almost all methods of meta-analysis (Bangert-Drowns, 1986; Hedges &
Olkin, 1985; Hunter & Schmidt, 1991; Rosenthal, 1984; Wolf, 1986).
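
The averaging step can be sketched as follows. The (r, N) pairs are hypothetical; only the weighting rule, each correlation weighted by its sample size, comes from the text.

```python
def weighted_mean_r(studies):
    """Average correlation, weighting each study's r by its sample size N."""
    total_n = sum(n for _, n in studies)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    return sum(r * n for r, n in studies) / total_n

def unweighted_mean_r(studies):
    """Simple average of the correlations, ignoring sample size."""
    return sum(r for r, _ in studies) / len(studies)

# Hypothetical studies as (r, N) pairs.
example = [(0.10, 100), (0.30, 300)]
print(round(weighted_mean_r(example), 4))
print(round(unweighted_mean_r(example), 4))
```

The larger study pulls the weighted average toward its own value, which is why a single very large sample can dominate the overall estimate.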

The final step involves testing the average estimate for
homogeneity. The test assumes that the average correlation comes from a
single distribution of effect sizes normally distributed. The differences
in effects (correlations) should only occur because of random sampling
error. If that is true, then the chi-square statistic will not be
significant; a nonsignificant chi-square suggests homogeneity. This
method of meta-analysis has been called the variance-centered method of
meta-analysis (Bangert-Drowns, 1986).
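
One common form of this test, associated with the Hunter and Schmidt approach, compares the sample-size weighted sum of squared deviations from the average correlation with what sampling error alone would produce. The sketch below assumes that form. The section does not print the formula, so the exact statistic is an assumption, and the studies are hypothetical.

```python
# Upper-tail .05 critical values of chi-square for small df (standard table values).
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def homogeneity_chi_square(studies):
    """Return (chi_square, df) for a list of (r, N) pairs.

    Assumed form: sum of N_i * (r_i - mean_r)^2, divided by (1 - mean_r^2)^2,
    with df equal to the number of studies minus one.
    """
    total_n = sum(n for _, n in studies)
    mean_r = sum(r * n for r, n in studies) / total_n
    weighted_ss = sum(n * (r - mean_r) ** 2 for r, n in studies)
    chi_square = weighted_ss / (1.0 - mean_r ** 2) ** 2
    return chi_square, len(studies) - 1

# Hypothetical studies as (r, N) pairs.
example = [(0.10, 100), (0.30, 300)]
chi_square, df = homogeneity_chi_square(example)
verdict = "heterogeneous" if chi_square > CHI2_CRIT_05[df] else "homogeneous"
print(f"chi-square({df}) = {chi_square:.2f}: {verdict}")
```

A value above the critical value signals that the correlations vary more than sampling error would predict, which is the cue to look for moderators.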

The follow-up test for moderators based on a heterogeneous finding for 
an average correlation requires two complete steps to account successfully 
for any heterogeneity found (Hall & Rosenthal, 1991). First, there should 
be within group homogeneity for a successful moderator. If the variable 
is categorical, each category should demonstrate homogeneity within 
categories. This homogeneity shows that the averages for each level
represent averages based on individual correlations that differ only
because of sampling error. The second requirement is that there should
exist significant differences in the mean correlations between groups. If
the groups are different then the correlations for each group should be
different. This process represents the classic homogeneity within groups
and heterogeneity between groups that supports the use of analysis of 
variance techniques. 
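
The within-group half of this check can be sketched as below. The grouping, the category labels, and the correlations are hypothetical, and the chi-square form is the same assumed variance-centered statistic, not a formula printed in this section. The group means are returned so the second requirement, differences between groups, can be inspected alongside.

```python
# Upper-tail .05 critical values of chi-square for small df (standard table values).
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def summarize(studies):
    """Return (mean_r, chi_square, df) for a list of (r, N) pairs."""
    total_n = sum(n for _, n in studies)
    mean_r = sum(r * n for r, n in studies) / total_n
    weighted_ss = sum(n * (r - mean_r) ** 2 for r, n in studies)
    chi_square = weighted_ss / (1.0 - mean_r ** 2) ** 2
    return mean_r, chi_square, len(studies) - 1

def moderator_report(coded_studies):
    """Summarize each category; homogeneous is None when only one study exists."""
    report = {}
    for category, studies in coded_studies.items():
        mean_r, chi_square, df = summarize(studies)
        homogeneous = None if df == 0 else chi_square <= CHI2_CRIT_05[df]
        report[category] = (mean_r, chi_square, df, homogeneous)
    return report

# Hypothetical coding of studies into three categories of a moderator.
coded = {
    "category A": [(0.10, 100), (0.12, 100), (0.08, 100)],
    "category B": [(0.30, 100), (0.33, 100)],
    "category C": [(0.20, 50)],
}
report = moderator_report(coded)
for category, (mean_r, chi_square, df, homogeneous) in report.items():
    print(category, round(mean_r, 3), round(chi_square, 3), df, homogeneous)
```

A category with a single study cannot be tested, which mirrors the situation reported later for methods used by only one investigation.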

More sophisticated techniques involve the use of analysis of variance
on the findings of a meta-analysis (Allen, 1989) or the use of multiple
regression to test the contribution of moderators (Dindia & Allen, 1992).
This analysis contained relatively few studies, and these were not equally
distributed across the categories; therefore no such analysis was conducted.
The use of ANOVA or multiple regression in the earlier cited meta-analyses
involved over 200 investigations, and even then some cells were combined
or eliminated due to small cell size. The possibility of interactions
between moderators remains untested.

RESULTS

Overall 

The overall average correlation (ave r = .107, k = 46, N = 64,925)
demonstrates a positive relationship between teaching evaluation scores
and research productivity. The relationship observed was not based on a
homogeneous set of correlations (χ²(45) = 117.94, p < .05). This
indicates that the average effect should be interpreted with caution since
at least one moderating variable probably exists. Due to the nature of
the data sets (one study with an extremely large sample size, Faia, N =
53,034), several data sets included as a zero correlation (Cornwell,
1974; Grant, 1971; Lasher & Vogt, 1974; Plant & Sawrey, 1970; Ratz, 1975;
Teague, 1981; Voeks, 1962), and significant correlations reported
without a direction (Cornwell, 1974; Lasher & Vogt, 1974), a series
of subsidiary analyses tested whether the coding of the studies can offer
a sufficient explanation for the heterogeneity.4
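
The weight carried by the single large study follows directly from the two sample sizes reported above. The arithmetic below uses only those figures; the variable names are illustrative.

```python
TOTAL_N = 64_925  # total N reported for the overall analysis
FAIA_N = 53_034   # N reported for the Faia data set

faia_share = FAIA_N / TOTAL_N
remaining_n = TOTAL_N - FAIA_N

print(f"share of total weight from the largest study: {faia_share:.1%}")
print(f"combined N of all other data sets: {remaining_n}")
```

Under sample-size weighting, roughly four fifths of the weight in the overall average therefore rests on one data set.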

Year of the Study as a Moderator 
A correlation was calculated between the size of the correlation and
the year of the study. A positive correlation indicates that the
association between teaching evaluations and research productivity is
growing. A negative correlation shows that the association between
research productivity and teaching evaluations diminishes with time. The
analysis shows that for both the sample size weighted (ave r = -.08) and
unweighted analysis (ave r = -.04) a small negative correlation exists
between the size of the association and the year of data collection.
Basically, little change occurs over time. A secondary analysis examined
whether the correlations changed over the various decades. The studies
were separated into four groups: (a) before 1960, (b) 1960s, (c) 1970s,
and (d) post-1980. The groups show that the general trend of the
correlations is negative, from a high correlation in the pre-1960 studies
(ave r = .209, k = 3, N = 673, χ²(2) = 23.98), to a smaller correlation in
the 1960s decade (ave r = .095, k = 13, N = 3,030, χ²(12) = 22.16), to a
slightly larger correlation in the 1970s decade (ave r = .112, k = 24,
N = 58,375, χ²(23) = 32.99), and the smallest average correlation occurring
in the post-1980 studies (ave r = .068, k = 5, N = 1,798, χ²(4) = 9.116).
Unfortunately, not all the groups are homogeneous, so this solution fails
to act as a sufficient moderator.
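
The decade figures can be checked against standard chi-square critical values with a few lines of arithmetic. The sketch below uses the values reported in the paragraph above and assumes the degrees of freedom for each group equal k - 1; the critical values are the usual .05 table entries.

```python
# (label, average r, k, N, chi-square, df) as reported above; df assumed to be k - 1.
DECADES = [
    ("pre-1960", 0.209, 3, 673, 23.98, 2),
    ("1960s", 0.095, 13, 3030, 22.16, 12),
    ("1970s", 0.112, 24, 58375, 32.99, 23),
    ("post-1980", 0.068, 5, 1798, 9.116, 4),
]

# Upper-tail .05 critical values of chi-square (standard table values).
CHI2_CRIT_05 = {2: 5.991, 4: 9.488, 12: 21.026, 23: 35.172}

total_k = sum(k for _, _, k, _, _, _ in DECADES)
total_n = sum(n for _, _, _, n, _, _ in DECADES)
pooled_r = sum(r * n for _, r, _, n, _, _ in DECADES) / total_n

heterogeneous = [
    label
    for label, _, _, _, chi_square, df in DECADES
    if chi_square > CHI2_CRIT_05[df]
]

print(f"k = {total_k}, N = {total_n}, pooled ave r = {pooled_r:.3f}")
print("groups exceeding the .05 critical value:", heterogeneous)
```

Under these assumptions the two earliest groups exceed their critical values, which is consistent with the statement that not all the groups are homogeneous. The group totals also fall one study short of the overall k, as expected given the one study excluded from the year analysis.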

Evaluation Method for Teaching 
One issue is whether the method used to evaluate the quality of
teaching acted as a moderator in the results. The five methods of
evaluating the quality of instruction were: (a) student evaluations, (b)
peer evaluations, (c) teaching awards, (d) amount of teaching related
activities, and (e) combinations of methods. The average correlation for
student evaluations (ave r = .082, k = 37, N = 11,177) was positive and
heterogeneous (χ²(36) = 86.65, p < .05). This indicates that there
still exists a moderating condition within this group.

The peer evaluations were also heterogeneous (χ²(5) = 19.76, p <
.05) and demonstrated a positive correlation (ave r = .320, k = 6, N = 685).
Faculty peers provided a larger correlation between teaching and research
productivity than did the average student evaluation correlation. Some
type of halo effect may exist for faculty rating other faculty, or
the result may indicate that one professional's judgment of another
professional stems from an understanding of the content and technique
beyond that of a naive student observer.

The teaching awards create a positive correlation between winning a
teaching award and research productivity (ave r = .110, k = 5, N = 53,337) and
the effect was homogeneous (χ²(4) = 2.70, p > .05). For the teaching
related activities (ave r = .325, k = 1, N = 174) and combinations of ratings
(ave r = .199, k = 1, N = 27) only one study used each method and no
homogeneity test can be conducted.

This moderator fails to act as a sufficient set of groupings to 
permit interpretation of the data. It should be noted that for all groups 
the correlation was positive and the absolute magnitude not that 
dissimilar when multiple studies appeared in the group.

Method of Measuring Research Productivity 
There were eight different methods used to measure research
productivity: (a) number of publications, (b) grants awarded, (c) number
of citations, (d) peer rating, (e) time spent on research, (f) research
awards won, (g) combination of grants and publications, and (h) research
creativity. The largest number of studies (k=31) examined the number of
publications. This method of measuring research productivity was
positively correlated (ave r = .109, N=62,507) with teaching effectiveness
but came from a heterogeneous sample of correlations (χ²(30)=64.75, p
< .05).

The number of grants awarded generates a slightly higher correlation
(ave r = .173, k=2, N=211) with a homogeneous sample (χ²(1)=0.60, p >
.05). Given the small number of studies and total sample size, any
definitive conclusions remain difficult to draw from this data.
Homogeneity also existed for the relationship between the number of
citations and teaching evaluations (ave r = -.032, k=5, N=1,036;
χ²(4)=6.85, p > .05). This constitutes the only average correlation
that was negative, but the small number of studies and sample size make it
difficult to draw firm conclusions.

The peer rating of research demonstrates a typical relationship with
teaching evaluations (ave r = .124, k=7, N=858) based on a homogeneous
sample (χ²(6)=6.40, p > .05). The sample sizes and correlations for
the research time group (ave r = .000, k=2, N=4,558, zero variance) and the
combination grants and publications group (ave r = .030, k=2, N=160,
χ²(1)=0.32, p > .05) were small. The research awards group (ave r =
.000, k=1, N=15) and research creativity group (ave r = .540, k=1, N=86)
contained only one study apiece.

Methods of measuring research productivity as a moderator fail to 
account for the available heterogeneity. The number of studies available 
remained too small in many categories to justify their existence and the 
only category with a large number of studies exhibits significant 
heterogeneity.

CONCLUSIONS

The results show a positive correlation between teaching
effectiveness and research productivity (ave r = .107). The correlation
indicates that as either teaching effectiveness or research productivity
increases the other variable does as well. Only a correlation exists; no
ability to evaluate any causality between the features is possible given
the restricted set of conditions of the data. A causal interpretation
requires the introduction of some type of theoretical mechanism that
translates the increasing value of one variable into an effect that
increases the value of the other variable. While arguments exist about
the nature of the connection, the exact causal connection remains
unclear. The ability to infer that a causal connection exists between
research productivity and teaching effectiveness does not exist within
this report. While a correlation is a necessary condition for an
interpretation of causality, it is not a sufficient condition.

There are two considerations that need to be addressed when
considering the impact of the connection observed. First, the average
correlation comes from a sample of heterogeneous effects; therefore any
interpretation must be cautious. The inability to generate a homogeneous
solution using the moderators creates some uncertainty about the ability
to generalize the average observed. The only exception to the rule that
all average correlations observed were positive or zero occurred for the
number of citations used as a measure of research productivity. Most
likely the "true" moderator variable would separate one smaller positive
effect from a larger positive effect. While this separation is important,
it is a distinction between magnitudes of positive correlations, not the
direction of the correlation. It seems warranted to expect, for example,
that the sheer number of publications is positively correlated with
teaching evaluations, even when considering any moderating variables.

The second issue considers the importance of the size of the average
correlation. Many would consider a correlation of .10 as small and
unimportant, contributing little to the variance. Abelson (1985)
points out that there exists much misconception about the importance of
what amounts to small effects. For example, the difference between a .200
and a .300 hitter in baseball for any one at bat is .00317 (using omega
squared) and the effect observed in this meta-analysis is far larger than
that difference.

The importance of the connection can be illustrated using a
Binomial Effect Size Display (BESD). Suppose we have 200
faculty members and 100 do research and 100 do not. The overall teaching
evaluation average across all 200 faculty is 50 with a standard deviation
of 10. We have their teaching evaluations and we find the correlation
between conducting research and teaching evaluations is .10 (the same as
the average in this meta-analysis). The mean for the research faculty for
teaching evaluations would be 51 and for the nonresearch faculty the mean
would be 49 (this is based on d = .20, which is equivalent to a correlation
of .10; d = (difference between means, or 2) / (standard deviation, or
10)). This is not a large difference, but suppose we want the top 16% of
the teachers (one standard deviation above 50, or a cutoff score of 60).
We would find 18.41% of the research faculty above that score while only
13.57% of the nonresearch faculty are above that score. Suppose we want the
very best teachers and use a cutoff score three standard deviations above
the mean (score of 80). We would find 0.19% of the research faculty and
0.10% of the nonresearch faculty achieving that goal. For this last severe
cutoff score there is an almost 2 to 1 advantage in the probability of a
research faculty member attaining a high ranking when compared to a
nonresearch faculty member. Table 3 provides an extended statistical
summary of the various features of the impact of a correlation of .10 on
the comparisons.
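
The arithmetic behind this illustration can be reproduced with a short sketch. It assumes teaching scores are normally distributed within each group with a standard deviation of 10 and group means of 51 and 49, as in the example above; the function names are illustrative. The ratios are computed from unrounded percentages, so they may differ slightly from ratios formed from the rounded percentages.

```python
import math


def upper_tail(cutoff: float, mean: float, sd: float) -> float:
    """Proportion of a normal distribution lying above the cutoff."""
    z = (cutoff - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def r_to_d(r: float) -> float:
    """Convert a correlation to a standardized mean difference (equal groups)."""
    return 2.0 * r / math.sqrt(1.0 - r * r)


def besd_rates(r: float) -> tuple:
    """Classic BESD split at the overall mean: 50% plus or minus r/2."""
    return 0.5 + r / 2.0, 0.5 - r / 2.0


def compare_groups(cutoff: float, research_mean: float = 51.0,
                   nonresearch_mean: float = 49.0, sd: float = 10.0) -> tuple:
    """Return (research %, nonresearch %, ratio, chance a qualifier is research)."""
    p_res = upper_tail(cutoff, research_mean, sd)
    p_non = upper_tail(cutoff, nonresearch_mean, sd)
    return 100 * p_res, 100 * p_non, p_res / p_non, p_res / (p_res + p_non)


print(f"d for r = .10: {r_to_d(0.10):.3f}")
print("BESD split at the mean:", besd_rates(0.10))
for cutoff in (60, 70, 80):  # one, two, and three SDs above the mean of 50
    res, non, ratio, share = compare_groups(cutoff)
    print(f"cutoff {cutoff}: research {res:.2f}%, nonresearch {non:.2f}%, "
          f"ratio {ratio:.2f} to 1, chance research {100 * share:.0f}%")
```

The more severe the cutoff, the larger the relative advantage a small mean difference produces, which is the point of the display.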

The practical implications of the finding deserve some
consideration. While research is not a perfect indication of high quality
teaching, clearly productive research is not inconsistent with quality
teaching. More than likely there is at some point a level of diminishing
returns where research efforts operate to reduce the quality of teaching,
but that point is not developed in this data. The data do clearly support
the idea that research productivity and quality teaching are not
contradictory goals; the degree to which they are compatible or
complementary goals could still be argued.

One interpretation implies the existence of some third factor
underlying both positive teaching evaluations and research productivity.
Some underlying personality or professional characteristic creates an
outcome that favors both quality teaching and high research productivity.
One interesting issue developed in an early study (Maslow & Zimmerman,
1956) was the high correlation between research creativity and teaching
evaluation. If one were to list those features contributing to successful
research (creativity, hard work, etc.) and compare them to those features
contributing to successful teaching (hard work, ability to explain) we
might find many features in common.

A second explanation might be that the nature of the teaching
evaluations interacts with time spent on teaching. Suppose that a
researcher's classes become routine standardized exercises that vary
little from year to year and the faculty member gives high grades. The
researcher might well be evaluated positively, while simultaneously
providing little inspiration to improve the students. The relationship
between positive overall evaluations and effectiveness as measured by
actual learning and understanding of material is unclear. If the argument
is that research degrades quality teaching, the argument turns on some
feature of the methodological aspects of evaluation being associated with
research productivity.

Future research should consider the issues about the impact of
research and the actual process of learning. Communication scientists
could make an impact by examining how researchers interact within the
classroom and whether that interaction is different from nonpublishing
faculty. Teaching is an interactive process between student and teacher,
where the teacher provides knowledge as well as instills curiosity within
the students. One issue of Communication Education (edited by Lawrence
Rosenfield in 1994) contains stories about the nature of how teachers
interacted with students to create a positive influence. None of the
stories or commentaries provides a connection between good research and
good teaching.

The ongoing debate over what duties a faculty member should be 
responsible for continues to receive attention. This summary provides 
some information about one relationship, between positive teaching 
evaluations and research productivity, by finding a positive 
relationship. The results provide a preliminary conclusion that there 
exists a small positive relationship between teaching and research. The 
findings do not end the argument about the nature of the trade-off but
they do provide evidence that the trade-off is not inevitable. Our 
departments need to consider this aspect of the relationship before 
establishing standards for hiring, merit, promotion, and tenure. 

FOOTNOTES

1 The arguments surrounding the measurement and documentation of
effective as well as positively evaluated teaching deserve consideration
but are beyond the scope of this report. For discussion of these issues
the reader is referred to other relevant meta-analyses (Feldman, 1976a;
1976b; 1977; 1978; 1979; 1983; 1984; 1986).

2 Several data sets were available in multiple manuscripts (Aleamoni
and Yimer, 1972; 1973; 1974; Freedman, Stumpf & Aguanno, 1976; Friedrich &
Michalak, 1983; Hoffman 1984a; 1984b; Michalak & Friedrich, 1981; Stumpf,
Freedman, & Aguanno, 1979; Wood, 1977, 1978). Several manuscripts only
offered reviews of the available literature (Blackburn, 1972; Braxton &
Bayer, 1986; Schachter, 1991) useful for bibliographic purposes but did
not contribute data sets. Data reported by Aiken (1975) were not included
because no sample size was reported.

3 There were several discrepancies between the correlations Feldman
(1987) reports using and what are used in this report. For example, he
provides no estimate for Ahem (1969) or Goldsmid, Gruber, and Wilson
(1978) although a significant correlation exists; this report estimates a
correlation that would be minimally significant given the available sample
size. Feldman reports a correlation of .30, which is the value of the
student evaluations; the correlation used here is .255, which is the
average of the student and the peer evaluations (.210). This procedure
also accounts for the discrepancy in the Friedrich and Michalak
(1983) correlation. Feldman also reports a sample size of 211 for the
Harry and Goldner (1972) study; the indication on table 2 of the article
is that every correlation has a different relevant sample size, and the
relevant sample size for this correlation is not 211 but 77. The Hayes
(1971) study had multiple sample sizes for each method of teaching
evaluations; this report uses the average sample size. Feldman did not
consider the Hoyt (1974) or Rossman (1976) measures as an indication of
teaching evaluation since they asked students to rate how much they had
improved in the subject matter because of the course. For this summary
the measure was considered appropriate as a student evaluation of a
course. Several estimates are larger in this report than those in the
Feldman report (see for example the Marquardt, McGann, and Jokubauskas
study (.280) versus that provided by Feldman (.250)) because of application
of the correction for measurement error due to attenuation. Comparing the
uncorrected estimates of this report to the uncorrected estimates provided
by Feldman finds complete correspondence except where noted. Feldman
provides no estimate for the Riley, Ryan, and Lifshitz (1950) study
because no calculation is possible. This investigation calculates
percentages of published versus unpublished faculty obtaining superior
teaching ratings. This permits the calculation of a d statistic using the
table in Glass, McGaw, and Smith (1981). In estimating the effect for
Stavridis, Feldman used only a few of the student evaluation items; this
report used a much larger percentage of the items and the correlation
changed from .240 to .163 (it should be noted that had the same items been
utilized the correlations would have been identical).
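
Two of the adjustments described in this footnote, averaging across evaluation methods and correcting for attenuation, can be sketched as follows. The correction shown is the standard formula dividing the observed correlation by the square root of the product of the two reliabilities. The reliability values below are hypothetical and chosen only for illustration; the footnote does not report the reliabilities actually used, and the function names are assumptions.

```python
import math
from typing import Sequence


def average_r(values: Sequence[float]) -> float:
    """Simple average of correlations from several measures in one study."""
    return sum(values) / len(values)


def correct_for_attenuation(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Disattenuate r: r_observed / sqrt(rel_x * rel_y).

    rel_x and rel_y are the reliabilities of the two measures, each in (0, 1].
    """
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / math.sqrt(rel_x * rel_y)


# Averaging the student (.30) and peer (.210) values cited in the footnote.
print(f"averaged r = {average_r([0.30, 0.210]):.3f}")

# HYPOTHETICAL reliabilities (1.00 and 0.80), not values from the report.
print(f"corrected r = {correct_for_attenuation(0.250, 1.00, 0.80):.3f}")
```

Because reliabilities cannot exceed 1, the corrected value is never smaller in magnitude than the observed value, which is why the corrected estimates run larger than the uncorrected ones.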

4 The subsidiary analyses demonstrate that despite the procedure used,
the same basic results were obtained. For example, the analysis conducted
by excluding the Faia study reveals the same basic results (ave r = .094,
k = 46, N = 11,933, with homogeneity results of χ²(45) = 115.438, p <
.05). Conducting the analysis without studies whose values could not be
calculated exactly and where estimates of zero were entered (Cornwell,
Teague, Voeks, Grant, Lasher, Ratz, Plant) demonstrates no divergent
results (ave r = .111, k = 39, N = 63,013, with homogeneity test χ²(38)
= 87.847). Another analysis, conducted by removing the studies with values
entered at zero and entering minimal positive values for studies
with significant but nondirectional findings (Lasher and Cornwell),
utilizing the procedures outlined by Boster and Mongeau (1984), reveals no
difference in results (ave r = .112), nor does using minimal negative
values (ave r = .110). Except where reported in the text, all subsequent
moderator analyses followed the same procedure for inclusion and
considered the possible influences of coding procedures. Except where
indicated in the text no differences existed based on this analysis. A
complete detailed series of the analyses is available from the author.

Table 1

Effect Sizes Relating Research Productivity and Teaching Effectiveness

Author¹          Date²      r³       N
Ahem             1969      .238      75
Aleamoni (1)     1973      .000      360
         (2)               .033      28
Bausell          1972      .061      105
Braunstein       1973      .040      349
Braxton          1983      .325      174
Bresler          1968      .227      106
Centra (1)       1983      .099      2,968
       (2)                 .073      1,623
Clark            1973      .255      45
Cornwell         1974      .000      70
Dent             1976      .022      90
Faia             1976      .110      53,034
Freedman         1979      .242      129
Frey             1978      .070      42
Goldsmid         1977      .172      90
Grant            1971      .000      [illegible]
Harry            1972      .190      77
Hayes            1971      .210      250
Hicks            1974      .192      459
Hoffman          1984     -.250      65
Hoyt             1974      .086      [illegible]
Hoyt             1976      [illegible]  [illegible]
Lasher           1974      .000      [illegible]
Linsky           1975      .009      1,091
Marquardt        1975      .286      91
Maslow           1956      .640      86
McCullagh        1975      .045      [illegible]
McDaniel         1970      .043      [illegible]
Michalak         1981      .260      86
Plant            1970      .000      [illegible]
Ratz             1975      .000      15
Richardson       1992      .260      67
Riley            1950      .220      389
Root             1987      .199      27
Rossman          1976      .327      122
Rushton          1983     -.066      52
Siegfried        1973      .039      45
Stallings (1)    1970      .260      128
          (2)              .105      121
Stavridis        1972      .163      32
Teague           1981      .000      16
Usher            1966      .230      26
Voeks            1962      .000      198
Wood             1976      .395      69
Wood             1978      .023      22

1 First author listed, see References for complete citation.
2 Date listed is date of publication not data set.
3 Correlation reported is corrected correlation averaged across
multiple measures. When the correlation differs from that reported by
Feldman (1987) an explanation is provided in Footnote 3.

Table 2

Methods of Measuring Research Productivity and Teaching Effectiveness

                                Method of Measuring
Author¹          Date²          Research Productivity         Teaching Effectiveness
Ahem             [illegible]    # of pub                      awards
Aleamoni (1)     1969-70        # of pub                      student
         (2)     1969-1970      # of pub                      peer
Bausell          1969-70        # of pub ([illegible])        [illegible]
                                grants ([illegible])
Braunstein       [illegible]    peer                          student
Braxton          1979           # of pub                      activities
Bresler          [illegible]    grants                        [illegible]
Centra (1)       [illegible]    # of pub                      [illegible]
       (2)       [illegible]    # of pub                      [illegible]
Clark            [illegible]    [illegible]                   [illegible]
Cornwell         [illegible]    [illegible]                   [illegible]
Dent             [illegible]    # of pub ([illegible])        [illegible]
                                citations ([illegible])
Faia             [illegible]    # of pub                      awards
Freedman         [illegible]    # of pub                      [illegible]
Frey             [illegible]    citations                     student
Goldsmid         [illegible]    # of pub                      [illegible]
Grant            [illegible]    [illegible]                   student
Harry            [illegible]    # of pub                      student
Hayes            [illegible]    # of pub                      peer (.289, 318)
                                                              student (.073, 183)
Hicks            [illegible]    # of pub                      student
Hoffman          [illegible]    # of pub                      student
Hoyt             [illegible]    # of pub                      [illegible]
Hoyt             [illegible]    peer                          student
Lasher           1969           time                          student
Linsky           (1955-75)      # of pub (.040, 1422)         student
                                citations (-.050, 766)
Marquardt        1972           peer                          student
Maslow           1943-46        creativity                    student (.510)
                                                              peer (.77)
McCullagh        1971-72        # of pub                      student
McDaniel         (1970)         # of pub                      student
Michalak         1977-78        # of pub (.320)               peer
                                citations (.200)
Plant            (1970)         pub and grant                 student
Ratz             1969-75        awards                        student
Richardson       (1992)         # of pub                      student
Riley            1947           # of pub                      student
Root             1985           peer                          combination
Rossman          1969           peer                          awards (.190)
                                                              peer (.230)
Rushton          1974-79        # of pub (.149)               student
                                citations (-.280)
Siegfried        1970-71        # of pub                      student
Stallings (1)    1965-66        # of pub                      student
          (2)    1967           # of pub                      student
Stavridis        1972           # of pub                      student
Teague           (1981)         # of pub                      awards
Usher            1965           peer                          student
Voeks            1948-52        # of pub                      student
Wood             1971-73        # of pub                      student
Wood             1974-77        pub and grant                 student

1 First author listed, see References for complete citation.
2 Date listed is date of actual data collection not publication date;
if date is in parentheses that indicates publication date because actual
data collection date not available.
3 First number in parentheses is correlation, second number is sample
size if different from overall. No entry means overall correlation is
based on this value.

Table 3

Binomial Effect Size Display for Interpreting Results

The following assumes that r = .10 and a scale with a mean = 50, standard
deviation = 10 and that each faculty group (research and nonresearch) is
equal in number.

                              Percentage of faculty      Ratio of         Percentage chance
                              past the cutoff score      research to      faculty member past
                                                         nonresearch      cutoff score is
Cutoff score                  research    nonresearch    faculty          research faculty member

Greater than the mean
"Above average"               55.00%      45.00%         1.22 to 1        55%

Greater than one
standard deviation
"Excellence"                  18.41%      13.57%         1.36 to 1        58%

Greater than two
standard deviations
"Outstanding"                  2.87%       1.79%         1.60 to 1        62%

Greater than three
standard deviations
"Teacher of the year"          0.19%       0.10%         1.90 to 1        66%